---
title: "PDFToImageContent"
id: pdftoimagecontent
slug: "/pdftoimagecontent"
description: "`PDFToImageContent` reads local PDF files and converts them into `ImageContent` objects. These are ready for multimodal AI pipelines, including tasks like image captioning, visual QA, or prompt-based generation."
---

# PDFToImageContent

`PDFToImageContent` reads local PDF files and converts them into `ImageContent` objects. These are ready for multimodal AI pipelines, including tasks like image captioning, visual QA, or prompt-based generation.

<div className="key-value-table">

|  |  |
| --- | --- |
| **Most common position in a pipeline** | Before a `ChatPromptBuilder` in a query pipeline                                                        |
| **Mandatory run variables**            | `sources`: A list of PDF file paths or ByteStreams                                                      |
| **Output variables**                   | `image_contents`: A list of ImageContent objects                                                        |
| **API reference**                      | [Image Converters](/reference/image-converters-api)                                                            |
| **GitHub link**                        | https://github.com/deepset-ai/haystack/blob/main/haystack/components/converters/image/pdf_to_image.py |

</div>

## Overview

`PDFToImageContent` processes a list of PDF sources and converts them into `ImageContent` objects, one for each page of the PDF. These can be used in multimodal pipelines that require base64-encoded image input.

Each source can be:

- A file path (string or `Path`), or
- A `ByteStream` object.

Optionally, you can provide metadata using the `meta` parameter. This can be a single dictionary (applied to all images) or a list matching the length of `sources`.
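The dict-or-list behavior described above can be sketched in plain Python. This is an illustrative simplification, not the component's actual code, and the `normalize_meta` helper name is hypothetical:

```python
def normalize_meta(meta, n_sources):
    """Broadcast a single dict to all sources, or validate a per-source list."""
    if meta is None:
        return [{} for _ in range(n_sources)]
    if isinstance(meta, dict):
        # A single dict is copied to every source.
        return [dict(meta) for _ in range(n_sources)]
    if len(meta) != n_sources:
        raise ValueError("meta list length must match the number of sources")
    return list(meta)

print(normalize_meta({"lang": "en"}, 2))
# [{'lang': 'en'}, {'lang': 'en'}]
```

A per-source list lets you attach distinct metadata, such as a document ID, to each PDF.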

Use the `size` parameter to resize images while preserving aspect ratio. This reduces memory usage and transmission size, which is helpful when working with remote models or limited-resource environments.
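An aspect-ratio-preserving fit works like thumbnailing: the page image is scaled so it fits inside the requested `(width, height)` box. The sketch below shows one way to compute such a fit; the `fit_within` helper and the no-upscaling choice are assumptions for illustration, not the component's actual implementation:

```python
def fit_within(width, height, max_width, max_height):
    """Scale (width, height) to fit inside the box, keeping the aspect ratio."""
    scale = min(max_width / width, max_height / height, 1.0)  # never upscale
    return round(width * scale), round(height * scale)

# A 1700x2200 page (US Letter rendered at 200 DPI) fit into an 800x800 box:
print(fit_within(1700, 2200, 800, 800))
# (618, 800)
```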

This component is often used in query pipelines just before a `ChatPromptBuilder`.

## Usage

### On its own

```python
from haystack.components.converters.image import PDFToImageContent

converter = PDFToImageContent()

sources = ["file.pdf", "another_file.pdf"]

image_contents = converter.run(sources=sources)["image_contents"]
print(image_contents)

# [ImageContent(base64_image='...',
#  mime_type='application/pdf',
#  detail=None,
#  meta={'file_path': 'file.pdf', 'page_number': 1}),
#  ...]
```

### In a pipeline

Use `PDFToImageContent` to supply PDF pages as image data to a `ChatPromptBuilder` for multimodal QA or captioning with an LLM.

```python
from haystack import Pipeline
from haystack.components.builders import ChatPromptBuilder
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.converters.image import PDFToImageContent

# Query pipeline
pipeline = Pipeline()
pipeline.add_component("image_converter", PDFToImageContent(detail="auto"))
pipeline.add_component(
    "chat_prompt_builder",
    ChatPromptBuilder(
        required_variables=["question"],
        template="""{% message role="system" %}
You are a helpful assistant that answers questions using the provided images.
{% endmessage %}

{% message role="user" %}
Question: {{ question }}

{% for img in image_contents %}
{{ img | templatize_part }}
{% endfor %}
{% endmessage %}
"""
    )
)
pipeline.add_component("llm", OpenAIChatGenerator(model="gpt-4o-mini"))

pipeline.connect("image_converter", "chat_prompt_builder.image_contents")
pipeline.connect("chat_prompt_builder", "llm")

sources = ["flan_paper.pdf"]

result = pipeline.run(
    data={
        "image_converter": {"sources": sources, "page_range": "9"},
        "chat_prompt_builder": {"question": "What is the main takeaway of Figure 6?"}
    }
)
print(result["llm"]["replies"][0].text)

# ('The main takeaway of Figure 6 is that Flan-PaLM demonstrates improved '
#  'performance in zero-shot reasoning tasks when utilizing chain-of-thought '
#  '(CoT) reasoning, as indicated by higher accuracy across different model '
#  'sizes compared to PaLM without finetuning. This highlights the importance of '
#  'instruction finetuning combined with CoT for enhancing reasoning '
#  'capabilities in models.')

```

## Additional References

🧑‍🍳 Cookbook: [Introduction to Multimodality](https://haystack.deepset.ai/cookbook/multimodal_intro)
